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Abstract 

Many  interior-point  methods  for  linear  programming  are  based  on  the  prop¬ 
erties  of  the  logarithmic  barrier  function.  We  first  give  a  convergence  proof  for 
the  (primal)  projected  Newton  barrier  method.  We  then  analyze  three  types 
of  barrier  method  that  can  be  categorized  as  primal,  dual  and  primal-dual.  All 
three  approaches  may  be  derived  from  the  application  of  Newton’s  method  to 
different  variants  of  the  same  system  of  nonlinear  equations.  A  fourth  variant 
of  the  same  equations  leads  to  a  new  primal-dual  algorithm. 

In  each  of  the  methods  discussed,  convergence  is  demonstrated  without  the 
need  for  a  nondegeneracy  assumption.  In  particular,  convergence  is  established 
for  a  primal-dual  algorithm  that  allows  a  different  step  in  the  primal  and  dual 
variables. 

Finally,  we  describe  a  new  method  for  treating  free  variables. 

Keywords:  linear  programming,  barrier  methods,  interior-point  methods. 


1.  Introduction 


For  the  most  part  we  consider  linear  programs  in 

minimize  c^x 

X 

subject  to  Ax  =  b. 


the  following  standard  form: 


X  >  0, 


(1.1) 


where  A  is  an  m  x  n  matrix  wi*h  m  <  n.  We  focus  on  interior-point/barrier  methods 
to  solve  (1.1). 
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Primal-dual  methods  for  linear  programming 


Initially  we  prove  convergence  for  a  primal  barrier  algorithm  in  which  the  iter¬ 
ates  are  assumed  to  satisfy  Ax  =  6,  but  our  main  interest  is  in  primal-dual  algo¬ 
rithms  that  make  as  few  assumptions  as  possible  about  the  initial  approximation 
to  variables,  and  do  not  require  a  transformation  of  (1.1)  into  some  mathematically 
equivalent  linear  program.  We  also  allow  liberal  control  of  the  barrier  parameter. 

A  number  of  authors  have  described  primal-dual  algorithms  that  converge  in 
polynomial  time  (see,  e.g.,  Kojima,  Mizuno  and  Yoshise  [KMY89];  Monteiro  and 
Adler  [MA89]).  However,  such  algorithms  are  generally  theoretical  and  differ  from 
the  relatively  few  primal-dual  algorithms  that  have  been  implemented  for  practical 
problems  (see,  e.g.,  McShane,  Monma  and  Shanno  [MMS89],  Lustig,  Marsten  and 
Shanno  [LMS89,LMS90],  Mehrotra  [Meh90],  and  Gill  et  al.  [GMPS91]).  Two  key 
differences  are  the  assumption  that  the  step  taken  in  the  primal  and  dual  spaces  are 
the  same  and  that  the  initial  estimate  of  the  solution  is  primal  and  dual  feasible. 
It  may  be  argued  that  the  feasibility  assumption  is  not  overly  restrictive  because 
the  linear  program  can  be  transformed  into  another  problem  with  an  identical  so¬ 
lution,  but  a  known  feasible  point.  However,  this  ignores  the  possibility  that  the 
transformed  problem  may  be  more  difficult  to  solve  than  the  original. 

Recently,  Kojima,  Megiddo  and  Mizuno  [KMM90)  have  analyzed  a  primal-dual 
algorithm  that  is  more  similar  to  implemented  algorithms.  They  define  a  steplength 
rule  that  allows  (but  does  not  guarantee)  the  possibility  of  different  steps  in  the 
primal  and  dual  spaces.  They  assume  that  the  initial  point  is  feasible. 

The  principal  algorithms  considered  here  do  not  require  feasible  iterates,  and 
different  steps  may  always  be  taken  in  the  primal  and  dual  spaces.  These  algorithms 
may  be  loosely  categorized  as  primal,  dual  or  primal-dual  in  order  to  distinguish 
between  the  different  approaches.  Howevu,  all  of  them  are  primal-dual  in  the  sense 
that  this  term  has  been  used  for  interior-point  methods. 

It  is  not  within  the  scope  of  this  paper  to  provide  a  numerical  comparison  of 
the  different  methods.  Our  intention  is  to  give  the  methods  a  common  setting  and 
thereby  highlight  their  similarities  and  differences.  Our  main  purpose  is  to  define 
and  analyze  implementable  algorithms.  For  the  purposes  of  analysis,  it  is  necessary 
to  include  some  procedures  that  are  not  present  in  standard  implementations — 
the  most  notable  of  these  being  the  definition  of  the  steplength  as  the  result  of  a 
linesearch  instead  of  as  a  fixed  fraction  of  the  largest  feasible  step.  However,  the 
proposed  linesearches  are  simple  to  implement  and  do  not  add  significantly  to  the 
cost  of  an  iteration.  Moreover,  the  traditional  “fixed”  steplength  usually  satisfies 
the  linesearch  criteria.  The  proofs  of  convergence  demonstrate  that  almost  any  step 
C8J1  be  taken  in  the  dual  space.  The  existence  of  a  wide  range  of  steps  for  which 
convergence  occurs  may  be  the  reason  for  the  robustness  of  algorithms  that  do  not 
incorporate  a  linesearch. 

All  the  properties  discussed  apply  to  more  general  methods  for  problems  in  which 
some  variables  have  upper  bounds  or  are  free.  However,  if  the  linear  systems  arising 
in  the  methods  are  solved  using  certain  Schur  complements,  free  variables  become 
troublesome.  In  Section  8  we  describe  a  new  technique  for  the  treatment  of  free 
variables. 

The  analysis  presented  here  is  applied  only  to  linear  programming.  It  has  been 


Primal  Barrier  Methods 
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shown  by  Ponceleon  [Pon90]  that  it  may  be  generalized  to  indefinite  quadratic  pro¬ 
grams. 


1.1.  Notation  and  Assumptions 

Let  a;*  denote  a  solution  to  (1.1)  and  let  X*  be  the  set  of  all  solutions.  Throughout 
we  make  the  following  assumptions: 

(i)  the  constraint  matrix  A  has  full  row  rank; 

(ii)  the  feasible  region  5o  =  {a;  |  Ax  =  6,  x  >  0}  is  compact;  and 

(iii)  a  strictly  feasible  point  exists,  i.e.  there  exists  at  least  one  point  x  such  that 
Ax  =  b  and  x  >  0. 

We  shall  use  N  to  denote  the  matrix  whose  columns  form  a  basis  for  the  null  space 
of  A  (thus  AN  =  0).  Occasionally  it  will  be  necessary  to  refer  to  the  i-th  element  of 
a  sequence  of  vectors  {a:,}  and  the  j-th  component  yj  of  a  vector  y.  To  distinguish 
between  x,  and  t/j  we  shall  use  i  to  denote  the  i-th  member  of  a  sequence  of  vectors, 
and  j  to  denote  the  j-th  component  of  a  vector.  Unless  otherwise  stated,  ||  •  ||  refers 
to  the  vector  two-norm  or  its  induced  matrix  norm.  The  vector  e  denotes  the  vector 
(1,  1,...,!)^. 


2.  Primal  Barrier  Methods 

Barrier  methods  for  linear  programming  generate  approximations  to  both  the  primal 
and  dual  variables  at  each  iteration.  We  shall  use  the  term  primal  method  to  refer 
to  a  method  that  generates  strictly  positive  values  of  the  primal  variables  x,  but 
does  not  restrict  the  values  of  the  dual  slack  variables  z.  In  the  first  algorithm  we 
assume  that  the  primal  variables  are  feasible,  i.e.,  that  Ax  =  b.  This  assumption  is 
relaxed  for  the  remaining  algorithms. 

2.1.  The  Primal  Barrier  Subproblem 

Barrier  methods  involve  major  and  minor  iterations.  Each  major  iteration  is  asso¬ 
ciated  with  an  element  of  a  decreasing  positive  sequence  of  barrier  parameters  {/f/..} 
such  that  Wnik-^oot^k  =  0.  The  minor  iterations  correspond  to  an  iterative  process 
for  the  solution  of  the  subproblem 

n 

minimize  B(x,u)  =  c^x  -  u  In  x, 

x&it’'  ^  ^ ■'  (2.1) 

subject  to  Ax  =  b, 

which  is  solved  at  every  major  iteration,  i.e.,  for  each  value  of  /t  =  pk-  Since 
B{x,p)  is  a  strictly  convex  function,  there  exists  a  unique  minimizer  x*{p)  such 
that  Ax*{p)  =  b  and  x*{p)  >  0. 

Barrier  methods  are  based  on  the  fundamental  result  that  lim,,_.O'i’*(/0  G  A'*. 
For  a  proof  of  this  result  and  a  general  discussion  of  barrier  methods,  see  Fiacco 
and  McCormick  [FM68]. 
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The  special  form  of  the  derivatives  of  the  barrier  function  makes  Newton’s 
method  a  natural  choice  for  solving  problem  (2.1).  At  any  given  point  x,  New¬ 
ton’s  method  defines  a  search  direction  Ax  such  that  x  +  Ax  continues  to  satisfy  the 
linear  constraints  and  minimizes  a  quadratic  approximation  to  the  barrier  function. 
The  vector  Ax  is  the  solution  of  the  quadratic  program 

minimize  \Ax^HAx  +  g^Ax 

Ax 

subject  to  AAx  =  0, 


where  g{x,fi)  =  c  -  g.X~^e  and  H{x,g.)  =  are  VB{x,fi)  and  V^5(a:,/Li),  the 
gradient  and  Hessian  of  the  barrier  function,  with  X  =  diag(a:j).  If  y  denotes  the 
Lagrange  multiplier  vector  at  x  associated  with  the  constraints  Ax  —  b,  the  updated 
multipliers  y  +  Ay  &t  x  +  Ax  satisfy 


where 


(2.2) 


We  shall  refer  to  this  system  of  equations  as  the  KKT  system  and  to  the  matrix  K 
as  the  KKT  matrix. 


2.2.  The  Projected  Newton  Barrier  Method 

The  formulation  of  the  barrier  subproblem  (2.1)  and  the  calculation  of  x*{n)  by 
Newton’s  method  was  first  embodied  in  the  projected  Newton  barrier  method  of  Gill 
et  al.  [GMSTW86].  The  method  requires  the  specification  of  two  positive  sequences: 
a  bounded  sequence  {5^}  that  determines  the  accuracy  of  the  solutions  of  (2.1)  and 
a  decreasing  sequence  of  barrier  parameters  {pk]  such  that  limfc_oo/i;-  =  0. 

Algorithm  PFP  (Model  Primal  Feasible-Point  Algorithm) 

Compute  xo  such  that  Axq  =  b,  xq  >  0; 

Set  k  =  0,  i  =  0  and  4  =  0; 
while  not  converged  do 
Set  p  =  pk] 

while  ||iV^p(x,-,/i)||  >  Skp  do 
Find  x,+i  such  that 

B{xi+i,p)  <  B{xi,p),  Xf+i  >  0  and  Ax,+i  =  6; 

Set  i  =  i+  1; 
end  do; 

Set  k  =  k  +  1,  ik  =  i; 
end  do 

Each  member  of  the  subsequence  {xj*}  corresponds  to  an  approximate  minimizer 
of  the  subproblem  (2.1)  defined  ’oy  pk-  We  shall  refer  to  the  consecutive  indices  of 
the  sequence  of  minor  iterations  ik-i,  ik-i  +  1»  •  •  •  >  as  Ijt. 

Since  limfc_oo/^fc  =  0,  it  follows  that  limfc_oo  ll®i*  -  *tll  “  "'^^re  x’^  is  the 

nearest  point  to  Xi^  in  X* .  The  main  difficulty  lies  in  generating  the  sequence  of 


2.  Primal  Barrier  Methods 


minor  iterates  {x,-,  i  G  I*}  so  that  the  condition  ||iV^5(®t,/i)l|  <  Skfik  is  eventually 
satisfied.  This  issue  is  addressed  in  the  next  section. 

The  precise  form  of  the  termination  condition  for  the  minor  iterations  is  not 
crucial.  The  only  requirement  is  that  the  condition  be  satisfied  in  a  neighborhood 
of  x*{nk)  that  shrinks  to  the  point  x*{fik)  as  k  ->■  oo. 


2.3.  Convergence  for  the  Primal  Subproblem 

In  this  section  we  show  that  the  sequence  {x,-,  i  =  generated  by  Newton’s 

method  with  =  0  converges  to  x*{fik).  It  follows  that  for  5jt  >  0  the  number  of 
minor  iterations  required  *^o  satisfy  the  termination  condition  is  finite. 

Throughout  this  section  we  shall  use  the  notation 

fi  =  fik,  B{x)  =  B{x,n),  gix)  =  g{x,fj,),  H{x)  =  H{x,n), 

to  refer  to  quantities  associated  with  the  k-th  subproblem. 

The  feasible  set  So  is  compact  by  assumption.  Given  a  positive  constant  6  and 
a  feasible  vector  w  such  that  w  >  6e,  let  i7{w,  fi)  denote  the  level  set 

Q{w,g.)  =  {x  I  B{x)  <  B{w)]. 

We  have  in  mind  w  being  the  first  minor  iterate  x,-^_,  associated  with  fi  and  9  being 
the  smallest  component  of  w.  Every  subsequent  minor  iterate  will  lie  in  the  set 
So  n  Q{w,n). 

The  essential  element  of  the  proof  is  the  demonstration  that  the  KKT  matrix  is 
bounded  and  has  a  bounded  condition  number  at  every  point  in  the  set 

5  =  5o  n 

By  assumption,  A  is  bounded  and  has  a  bounded  condition  number.  It  follows 
that  K  will  also  have  this  property  if  H  is  bounded  and  has  a  bounded  condition 
number.  The  latter  properties  in  //  follow  from  the  following  lemma,  which  shows 
that  {(x,)j}  is  bounded  above  ana  bounded  awa.y  from  zero. 


Lemma  2.1.  Let  6  be  a  positive  constant,  and  let  w  be  a  given  vector  such  that 
w  >  6e  and  Aw  =  b.  There  exist  positive  constants  cxx  and  Tx,  independent  of  x, 
such  that  OxC  <  x  <  TxC. 


Proof.  The  set  S  is  compact  since  it  is  the  intersection  of  the  two  closed  sets  So 
and  Q,  and  it  is  a  subset  of  the  bounded  set  So-  Since  S  is  compact,  there  exists  a 
constant  Tx  such  that  Xj  <  Tx-  The  definition  of  S  implies  that  every  x  G  S  gives 
B{x)  <  B{w).  It  follows  that  for  all  x  G  5, 


n  n 

c^x  -  /i ^ In Xj  <  c^w  -  b' Wj . 
3=1  ;=i 


Therefore  for  each  j, 


n 

-/ilnxj  <  c^w  -  /X  ^IntUj  -  c^x  +  /i  ^In  Xr. 

j  =  l  rytj 
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Since  S  is  compact,  the  quantities  w  =  max{|c^i|  ]  x  €  5}..  and  /3  =  max{lna;j  |  a;  G 
5}  are  bounded.  Similarly,  if  0  >  0,  constant  0  =  max{)9,-  In^}  is  also  bounded, 
and  -/zlnxj  <  2a;  +  2n/z/?,  or  equivalently, 

Xj  >  >  0, 


as  required.  | 

Corollary  2.1.  Let  x  be  any  element  of  S.  Let  H{x)  =  where  X  =  diag(a:j). 
Then  there  exist  positive  constants  cth  and  Th,  independent  of  x,  such  that  for  all 
vectors  u, 

<  u^H{x)u  <  r„l|«lp.  I 

Lemma  2.2.  At  every  element  of  the  sequence  {x,-,  i  G  Ifc}  of  Algorithm  PFP,  the 
matrix  K  is  bounded  and  has  a  bounded  condition  number.  | 

We  now  show  that  the  sequence  {x,}  generated  by  Newton’s  method  converges 
to  x*{p),  which  implies  that  the  condition  ||iy^igf(x,)||  <  Skfi  will  be  satisfied  in  a 
finite  number  of  iterations. 

The  iterates  of  the  projected  Newton  barrier  method  satisfy  x,+i  =  x;  +  a,Z\x,’, 
where  the  search  direction  Ax,  is  defined  by  (2.2).  The  steplength  a,'  is  determined 
from  a  linesearch,  which  locates  a  steplength  that  gives  a  sufficient  decrease  in  D{x). 
Throughout  we  shall  use  the  Goldstein- Armijo  conditions  to  define  the  steplength, 
although  any  of  the  standard  steplength  criteria  would  be  suitable  (see,  e.g.,  Ortega 
and  Rheinboldt  [OR70]).  For  minimizing  5(x),  the  Goldstein- Armijo  conditions  are 

T]iaiAxJg{xi)  <  R(x,-  +  OiAxi)  -  B(xi)  <  Ti2aiAxJg{xi),  (2.3) 

where  0  <  772  <  r?!  <  1. 

Theorem  2.1.  Let  {x,}  be  the  sequence  generated  by  Newton’s  method  applied  to 
the  problem  (2.1).  Then  lim,_oo  ||a:,-  —  a:*(M)ll  =  0. 

Proof.  Since  a  strictly  feasible  minimizer  of  the  barrier  function  e.xists  along  Ax,, 
theie  must  exist  a  positive  step  a,’  such  that  x,-  OiAxi  is  a  strictly  feasible  point 
and  the  Goldstein- Armijo  conditions  are  satisfied.  Consequently,  jB(x,+i)  <  B(xi). 
Since  x;  G  S  and  S  is  compact,  it  follows  that  B(x,)  is  bounded  below  and 

lim  {5(x.+i)  -  5(x,)}  =  0.  (2.4) 

1— KX) 

Let  B{x)  denote  the  Hessian  matrix  of  B(x).  Since  x,-  G  S,  the  corollary  to 
Lemma  2.1  implies  that  there  exists  a  positive  constant  such  that 

AxjH{xi)Axi  >  a„||i4x,|p.  (2.5) 


From  (2.2)  we  have 


AxjH{xi)Axi  =  -Axjg{xi). 


3.  Getting  Feasible 
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Combining  this  identity  with  (2.5)  and  the  second  Goldstein- Armijo  inequality  (2.3) 
gives 

B{xi)  -  B{xi+i)  >  -r}20iiAxJg{xi)  >  7j20HQ:iH^a:,jp, 

which  implies  that  lim,_,oo  o;,j|/la;,jp  =  0  from  (2.4). 

The  Taylor-series  expansion  of  5{x,-  +  a,\4x,)  gives 

5(x,+i)  =  B(xi)  -f-  aiAxfg(xi)  +  ^aiAxjH{xi)Axi, 

where  x,'  =  x,-  +  OaiAxi  for  some  0  <  ^  <  1.  Using  this  expansion  in  the  first 
Goldstein-Armijo  inequality  (2.3)  gives 

aiAxJg{xi)  -1-  \aiAxjH{xi)Axi  >  r)ijciAxfg{xi), 

and  since  Axjg{xi)  <  0,  we  have 

i4*rs(»()i  <  (2.6) 

Since  S  is  convex,  x,-  €  S  and  it  follows  from  the  corollary  to  Lemma  2.1  that  there 
exists  a  constant  Th  such  that 

AxJjI{xi)Axi  <  T„||/ix,|p. 

Combining  this  inequality  with  (2.6)  gives 

\^xj9{xi)\  <  •^^^^AxjH{xi)Axi  <  ~^^ai\\Axif. 

Since  lim,_oo  aji|/la;,jp  =  0,  we  obtain 

lim  Axjg{xi)  =  0.  (2.7) 

*—*00 

From  (2.2)  we  have 

N'^H{x,)NAx:,^  =  -N^gix,),  (2.8) 

where  Axi  =  N Axf,,-  Since  N^II{x{)N  is  bounded  and  has  a  bounded  condition 
number,  it  follows  from  (2.7)  and  (2.8)  that  lim,_oo  Axi  =  0  and  lim,_co  N^g{x,)  = 
0.  Since  x*{fi)  is  the  unique  feasible  point  for  which  N'^g{x*{^))  =  0,  we  have 
lim:_,oo  j|x,-  -  x*(/i)j|  =  0  as  required.  | 

3.  Getting  Feasible 

There  are  various  ways  to  eliminate  the  requirement  of  Algorithm  PFP  that  an 
initial  strictly  feasible  point  be  known. 
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3.1.  An  Artificial  Variable 

A  common  approach  is  to  introduce  an  additional  variable,  or  set  of  variables,  and 
minimize  a  composite  objective  function.  For  example,  given  xo  >  0,  consider  the 
transformed  problem 

minimize  c^x  +  of 

subject  to  Ax  +  ^u  =  b,  x  >  0,  ^  > -1, 

where  u  =  {b  -  Axo)/||6  -  Axo||  and  p  is  a  positive  scalar.  The  initial  value  of  ^ 
is  ||6  -  Axoll,  so  a  strictly  feasible  point  for  the  transformed  problem  is  known.  If 
a  step  would  make  ^  negative  during  an  iteration,  a  shorter  step  is  taken  to  make 
^  =  0.  Once  (  is  zero,  it  is  eliminated  from  the  problem. 

The  difficulty  with  this  and  similar  approaches  lies  in  choosing  the  value  for  the 
parameter  p.  Although  p  must  be  sufficiently  large,  if  it  is  chosen  too  large,  the 
infeasibilities  dominate  the  objective  function  and  the  method  behaves  like  a  two- 
phase  algorithm.  If  no  strictly  feasible  point  exists,  the  efficiency  of  the  algorithm 
can  depend  critically  on  the  choice  of  p. 


3.2.  A  Merit  Function 


The  method  of  Section  2.1  may  be  generalized  so  that  Ax  is  the  solution  of  the 
quadratic  program 

minimize  lAx^HAx  +  g^Ax 

Ax  ^ 

subject  to  AAx  =  b  -  Ax 


and  satisfies 


/  Ax 

V  ~^y 


-9  +  \ 

b-  Ax  j 


(3.1) 


We  may  then  introduce  a  merit  function  that  balances  the  aims  of  minimizing 
B{x,p)  and  reducing  some  norm  of  Ax-b.  For  example,  one  possible  merit  function 
is 

M{x,p)  =  B{x,p,)  +  p\\Ax  -  b\\i, 

where  p  is  chosen  suitably  large.  It  can  be  shown  that  if  Ax  is  defined  by  (3.1)  then 
it  is  a  descent  direction  for  M{x,p)  (see  Section  7).  We  may  prove  convergence  in  a 
manner  similar  to  that  for  the  feasible-point  algorithm.  It  may  be  thought  that  this 
approach  also  depends  on  choosing  a  “good”  value  for  the  parameter  p.  However,  p 
affects  only  the  steplength  and  not  the  direction  of  search.  Moreover,  it  is  relatively 
trivial  to  adjust  p  dynamically.  We  can  take  the  step  we  would  like  to  take  and 
then  check  whether  a  suitable  value  of  p  exists  for  which  the  linesearch  criteria  are 
satisfied. 

We  shall  not  pursue  this  approach  with  respect  to  primal  barrier  algorithms, 
since  we  think  a  better  approach  is  outlined  in  the  next  section.  However,  we  shall 
return  to  this  merit  function  when  we  discuss  a  primal-dual  method  in  Section  7. 


Jj.  A  Primal  Primal-Dual  Method 
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3.3.  Newton’s  Method  Applied  to  the  Optimality  Conditions 

Since  is  strictly  convex,  a:*(/i)  is  the  only  strictly  feasible  constrained  sta¬ 

tionary  point  for  problem  (2.1).  Therefore,  an  alternative  method  for  finding  x*{^) 
is  to  use  Newton’s  method  for  nonlinear  equations  to  find  the  stationary  point  of  the 
Lagrangian  function  L{x,y)  =  B{x,n)  ~  y^{Ax  -  b).  Since  the  gradient  of  L{x,y) 
is  zero  at  we  obtain  the  n-\-  m  nonlinear  equations 

(3.2) 

whose  Jacobian  is  closely  related  to  the  KKT  matrix  K.  The  solution  of  the  KKT 
system  (3.1)  is  a  descent  direction  for  ||VZ»|j,  and  a  steplength  may  be  chosen  to 
achieve  a  sufficient  reduction  in  ||Vil|.  As  in  Algorithm  PFP,  this  merit  function 
ensures  that  Xj  cannot  be  arbitrarily  close  to  its  bound. 

We  now  extend  this  approach  to  obtain  the  principal  algorithms  of  interest  in 
this  paper. 


4.  A  Primal  Primal-Dual  Method 

Following  common  practice,  we  introduce  a  third  vector  of  variables  z  =  c  -  A^y. 
We  now  wish  to  solve  the  nonlinear  equations  fp{z,x,y)  =  0,  where 


(0 

^  z-nX-'^e  \ 

/I 

= 

c  -  A^y  -  z 

V  ; 

y  Ax  -  b  j 

(4.1) 


When  it  is  necessary  to  consider  the  full  vector  of  variables  z,  x  and  y,  the  vector  v 
will  denote  the  {2n  -f  m)- vector  (2,  x,  -y).  The  symbols  fp{z,x,y)  and  fp{v)  will 
be  used  interchangeably  for  fp,  depending  on  the  point  of  emphasis.  The  Newton 
direction  Av  =  {Az,  Ax,  -Ay)  satisfies  the  linear  system 


JpAv  —  —fp, 


where 


Jp 


'  I  0  \ 

-I  0 

^  0  A  0  J 


(4.2) 


Apart  from  the  last  block  of  columns  being  multiplied  by  -1,  Jp  is  the  Jacobian  of 
the  nonlinear  equations  (4.1).  We  shall  refer  to  Jp  as  the  Jacobian. 

The  directions  Ax  and  Ay  from  (4.2)  are  identical  to  tho.se  defined  by  the  KK'f 
system  (3.1),  and  to  those  associated  with  (3.2).  However,  for  the  nonlinear  equa¬ 
tions  VL{x,y)  =  0  and  fpyZ,x,y)  =  0,  the  steplength  is  chosen  to  produce  a  suffi¬ 
cient  decrease  in  l|VT|p  and  ||/plp  respectively.  In  the  latter  case,  the  Goldstein- 
Armijo  conditions  give  the  following  conditions  on  o,: 

0  <  -2r]2aiAv'^jJ fp{v,)  <  \\fp{v,)\f  -  ||/p(?;.+i)|l2  <  -2qya,Avf]l fp{v,). 
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Since  JpAv  =  -  fp{vi),  these  conditions  can  be  restated  in  the  equivalent  form 


0  <  2rj2ai  <  1  - 


WfriviW 


<  27/10,-, 


which  is  easily  tested. 

Since  the  residuals  /  and  r  are  linear  in  x,  y  and  z,  they  are  simply  related  to 
their  values  in  the  previous  iteration.  Suppose  that  r  and  /  are  nonzero  at  iteration 
7.  After  a  step  of  Newton’s  method  with  steplength  a,-,  we  have 


r,-+i  =  (1  -  a,-)r,-  and  fi+i  =  (1  -  o, •)/,•. 


(4.3) 


At  the  first  iteration  jl^oH  and  ||7/o|l  are  bounded  and  xo  is  bounded  away  from  zero, 
which  implies  that  the  Jacobian  is  bounded  and  has  a  bounded  condition  number. 
It  follows  that  oo  >  0.  Hence  the  relations  (4.3)  imply  that  r,-  =  7,-ro  for  some  scalar 
7,-  such  that  0  <  7,-  <  1.  If  a  unit  step  is  taken  at  any  iteration,  /  and  r  will  be  zero 
for  all  subsequent  iterations. 

The  complete  algorithm  is  as  follows. 


Algorithm  PPD  (Model  Primal  Primal-Dual  Algorithm) 

Set  Vo,  with  aio  >  0  and  Zq  >  0; 

Set  k  =  0,i  =  0  and  ik  =  0; 
while  not  converged  do 
Set  n  =  fik; 

’/hile  \\fp{vi,fi)\\  >  Skfi  do 
Find  Vi+i  such  that 

||/p(Vi+i,/^)|P  <  \]fp{vi,n)\\'^  and  a:,-+j  >  0; 

Set  7  =  7+1; 
end  do; 

Set  A:  =  fc  +  1, 7fc  =  7; 
end  do 


4.1.  Convergence 

The  convergence  proof  for  this  algorithm  is  similar  to  that  for  Algorithm  PFP  in 
that  it  is  necessary  to  show  that  for  each  barrier  subproblem,  Jp  remains  bounded 
and  has  a  bounded  condition  number.  However,  in  Algorithm  PFP  the  iterates  he 
in  So,  whereas  here  it  is  not  obvious  that  the  iterates  {a:,-}  lie  in  any  compact  set. 
We  establish  this  fact  in  the  next  lemma  and  then  show  that  {v,-}  lies  in  a  compact 
set. 

Lemma  4.1.  Let  Tr  denote  a  positive  constant.  If  the  feasible  region  So  is  compact, 
then  so  is  the  set 

5^  =  {i  I  a;  >  0,  ||A®  -  6||  <  Tr}.. 

Proof.  Since  Sa  is  closed,  it  only  remains  to  be  shown  that  Sa  is  bounded.  Since  the 
elements  of  w  are  nonnegative,  it  follows  that  Sa  will  be  compact  if  e^w  is  bounded. 


4.  A  Primal  Primal-Dual  Method 
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Let  w  be  any  member  of  Sa  and  let  r^,  denote  the  residual  Aw  -  b  associated  with 
w.  Let  w*  be  a  solution  of  the  linear  program 

maximize  e^w 

(4.4) 

subject  to  Aw  =  6  +  r^,  w>0. 

It  follows  that  Sa  is  compact  if  e^w*  is  bounded. 

Let  R  denote  a  full-rank  matrix  whose  columns  form  a  basis  for  the  range  space 
of  A^.  It  follows  that  w  may  be  expressed  as 

w  =  Nwff  -1-  Rwji.  (4.5) 

In  particular,  w*  =  Nw"^  Rw*^,  and  substitution  in  (4.4)  gives 

ARw\  =  b  +  r^. 

Since  ||ru,||  <  and  AR  has  full  column  rank,  this  equation  implies  that  |lw’^||  is 
bounded.  Equation  (4.5)  now  implies  that  w*^^  is  a  solution  of 

maximize  c^Wn 

“'w  (4.6) 

subject  to  Nw.s'  >  -Rw%. 

Assume  that  the  linear  program  (4.6)  is  unbounded.  Then  there  must  exist  a 
nontrivial  feasible  direction  u  such  that  Nu  >0.  If  ®  is  any  point  in  So,  then 
X  -I-  7iVu  must  also  be  in  5o  for  any  positive  7,  which  contradicts  the  compactness 
of  So-  Consequently,  the  solution  of  (4.6)  is  bounded  and  Sa  is  compact.  | 


Lemma  4.2.  Let  ro  denote  the  residual  ro  =  Axo  -  b,  with  xo  >  0.  Define  the  set 
Sr  =  {{z,x,y)  I  X  >  0,  Aa;  -  6  =  7ro  } 
for  some  7,  0  <  7  <  1,  and  the  level  set 

=  {{z,x,y)  1  \\fp{z,x,y)\\  <  tj}. 

Then  the  set  S  =  Sr  Ci  rfrj^y.)  is  compact. 

Proof.  Throughout  this  proof  we  shall  assume  that  u  is  a  vector  in  5.  From  the 
definition  of  Sr,  we  have  ||  Ax  -  6|j  <  ||ro||  and  it  follows  from  Lemma  4.1  that  the  x- 
components  of  v  are  bounded.  It  remains  to  be  shown  that  the  y  and  z  components 
of  V  are  bounded.  Note  that  the  components  of  both  /  and  /  are  bounded  since 
they  are  components  of  the  bounded  vector  fp. 

Consider  the  equations  f  ■=■  z  -  pX~^e  of  (4.1).  Premultiplying  f  by  x^  and 
using  the  fact  that  both  x  and  /  are  bounded,  we  see  that  there  exists  a  constant 
Ti  such  that 


x^z  =  x^f  +  /in  <  Ti . 


(4.7) 
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Also,  since  a:  >  0, 

Zj  >  fj  >  -T2  (4.8) 

for  some  positive  constant  T2. 

If  a:^  is  now  applied  to  the  second  equation  of  (4.1),  /  =  c  -  A^y  -  z,  we  obtain 
a:^/  =  -  x^A^y  -  x^z  =  x^c  -  (6^+  r'^)y  -  x'^z. 

Simple  rearrangement  gives 

-  (6^4-  r^)y  =  x^f  +  x^z  -  a:^c,  (4.9) 

and  it  follows  from  (4.7)  and  the  bounds  on  /  and  x  that 

-  (6^+  r'^)y  <  T3.  (4.10) 

Similarly,  using  a:  =  xq  in  (4.9)  gives 

(6^+  rj)y  =  -xlf  -  xlz  +  xlc  <  -x^f  -  Y^{xo)jZj  +  x^c, 

J- 


where  J_  is  the  set  of  indices  of  the  negative  elements  of  z  (recall  that  the  elements 
of  xq  are  positive).  It  follows  from  (4.8)  that 

{b^+ro)y<T4.  (4.11) 

Using  (4.10)  and  the  assumption  that  r  =  yro  for  some  0  <  7  <  1  gives 

-  {b^+'rro)y  <r3.  (4.12) 


Combining  (4.11)  and  (4.12)  gives 


-b^y  < 


73  +  774 

1-7 


<  T5. 


It  now  remains  to  bound  the  term  x*'^z.  Using  (4.9)  with  x  =  x*  gives 

x*^z  =  x*^c  -  x*^f  —  b^y. 


Since  x*  >  0  and  ||x*||  is  bounded  (see  Lemma  2.1),  all  the  terms  on  the  right-hand 
side  of  this  expression  are  bounded,  with  x*^z  <  tq  for  some  positive  constant  rg. 
Lemma  2.1  also  implies  the  existence  of  positive  constants  cr^  and  such  that 
(^x  <  X*  <  Tx.  Clearly  \\zj\\  is  bounded,  with 

zj  <  (re  +  nTxT2)/(Tx. 

Since  A  has  full  row  rank,  the  bounds  on  |1/||  and  ||x||  in  the  equation  /  =  c-A^y-z 
imply  that  ||j/|l  is  bounded,  ;is  required.  | 


Lemma  4.3.  If  v  £  S  then  Jp  is  bounded  and  has  a  bounded  condition  number. 


5.  Summary  of  Primal  Methods 


IS 


Proof.  It  is  enough  to  show  that  xj  is  bounded  away  from  zero  if  v  £  S.  We  have 
from  (4.1)  that  Zj  -  fj  =  ft/xj.  Hence 

l^jl  +  ll/ll  > or  equivalently 

It  follows  from  Lemma  4.2  that  there  exists  a  positive  constant  Tz  such  that  \zj\  <  Tz 
for  all  u  G  5,  and  by  assumption,  1|/||  <  tj.  Hence,  Xj  >  hI{tz  +  tj)  >  0. 

From  Lemma  4.1,  x  is  uniformly  bounded  above.  Since  Xj  is  bounded  away  from 
zero,  Jp  is  bounded  and  the  condition  number  of  Jp  is  bounded.  | 

The  proof  of  the  following  theorem  is  similar  to  that  for  Theorem  2.1. 

Theorem  4.1.  Let  {v,}  be  the  sequence  generated  by  Newton’s  method  applied  to 
the  equations  (4-i)-  Then  lim,_oo  ||u,-  -  u*(/z)||  =  0.  | 

It  follows  that  Newton’s  method  generates  a  point  that  satisfies  the  condition 
^  ^  finite  number  of  iterations. 

5.  Summary  of  Primal  Methods 

In  all  the  algorithms  considered  so  far  (excluding  the  artificial-variable  method  of 
Section  3.1),  the  search  directions  for  x  and  y  are  the  same  as  those  given  by  (4.2). 
The  steplength  a  may  be  chosen  to  reduce  one  of  the  following  functions: 

(i)  M{x,p)  =  B{x,fj,)  p^Ax  -  6||i  (search  in  i-space). 

(ii)  ||c  -  pX~^e  -  -1-  ||i4x  -  bW^  (search  in  x  and  y-space). 

(iii)  \\c  -  z  -  A^y\\'^  -b  \[Z  -  pX~^e\^  -b  \\Ax  -  6|p  (search  in  a:,  y  and  z-space). 

The  only  additional  restriction  on  a  is  the  requirement  that  x  -b  aAx  >  0.  In  all 
Ccises,  approximations  in  the  x,  y  and  ^-space  may  be  generated  even  though  they 
are  necessary  only  in  (iii).  Thus,  all  three  methods  may  be  viewed  as  primal-dual 
algorithms. 

If  some  steplength  other  than  a  is  taken  along  Az  and  Ay,  a  sequence  of  auxiliary 
y  and  a:- values  can  be  generated  that  approximate  y*  and  z* .  For  this  sequence,  a 
different  step  az  in  the  y  and  2:-space  is  needed  to  maintain  ^  >  0.  Since  Oz  is  not 
usually  equal  to  a,  a  dual  feasible  point  may  be  forind  before  a  primal  feasible  point 
(or  vice  versa).  Provided  that  the  step  taken  in  the  i/-space  is  also  az,  once  a  dual 
fecisible  point  is  found,  all  subsequent  approximations  will  be  dual  feasible. 

One  advantage  of  (ii)  and  (iii)  is  that  it  is  not  necessary  to  compute  logarithms. 
Moreover,  it  is  not  necessary  to  define  a  parameter  p  that  balances  feasibility  and 
optimality,  although  it  may  be  advantageous  to  weight  the  norms  occurring  in  (ii) 
and  (iii). 
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6.  Dual  Methods 


The  dual  of  the  linear  program  (1.1)  may  be  written  as 

minimize  —  b^y 

y.* 

subject  to  c  -  A^y  -  z  =  0,  >  0. 


(6.1) 


The  dual  barrier  subproblem  is 


minimize 
yes’",  26S" 

subject  to 


c 


n 


}=i 

-  A^^y  -2  =  0. 


(6.2) 


Newton’s  method  applied  to  this  problem  defines  the  j/-space  search  direction  from 
a  system  similar  to  (2.2).  (The  right-hand  side  is  changed  and  H  =  (l//i)Z^,  where 
Z  =  diag(2_,).)  Given  an  initial  feasible  point  {yo,  2o)  we  may  define  a  dual  algorithm 
DFP  analogous  to  PFP. 

Similarly,  we  may  construct  an  algorithm  based  upon  the  optimality  conditions 
for  (6.2): 


X  -  fiZ~^e  =  0, 

c  -  A'^y  -2  =  0,  (6.3) 

Ax  -  b  =  0. 

As  noted  by  Megiddo  [Meg89],  the  solution  of  these  equations  is  identical  to  the 
solution  of  (4.1).  Newton’s  method  applied  to  (6.3)  solves  the  linear  system  JdAv  = 
-  /c,  where 


/ 

f) 

^  X  —  pZ~^e  ^ 

'  /rZ-2 

/ 

0  ^ 

f 

= 

c  -  A^y  -  2 

and  Jp  = 

-/ 

0 

V 

r  j 

Ax  -  b  ! 

1  0 

A 

0  / 

The  resulting  algorithm,  DPD,  is  identical  to  PPD  except  that  Jp  and  fp  are  re¬ 
placed  by  Jp  and  fp,  and  the  2-variables  are  restricted  during  the  linesearch  instead 
of  the  x-variables.  It  can  be  shown  that  like  Jp,  the  matrix  Jp  remains  bounded  and 
has  a  bounded  condition  number.  Moreover,  a  step  satisfying  the  Goldstein- Armijo 
conditions  must  exist,  since  ||/c||  would  be  infinitely  large  if  any  element  of  z  were 
zero.  Note  that  whenever  every  component  of  z  is  positive,  Av  is  a  descent  direction 
for  Wfpr. 

As  in  algorithm  PPD,  an  auxiliary  sequence  can  be  generated  by  allowing  the 
the  primal  and  dual  steplengths  to  be  different.  In  this  case,  the  sequence  would  be 
a  strictly  positive  approximation  to  x*. 

Theorem  6.1.  Let  {v,}  be  the  sequence  generated  by  Newton’s  method  applied  to 
the  equations  (6.8).  Then  lim,_oo  il^'j  -  v*(m)II  =  0- 


7.  Primal-Dual  Algorithms 
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Proof.  Let  v  —  {z,x,  -y).  It  follows  immediately  from  z  >  0  and 

-liZ~^e  +  X  =  / 

that  Xi  >  -Ti  >  —00  and  Lemma  4.1  implies  that  x  lies  in  a  compact  set.  An 
identical  argument  to  that  used  in  Lemma  4.2  shows  that  v  also  lies  in  a  compact 
set.  It  follows  immediately  that  ||Z"*||  is  bounded.  Hence  Jo  is  bounded  and  has  a 
bounded  condition  number.  The  required  result  follows  from  an  identical  argument 
to  that  of  Theorem  2.1.  | 


7.  Primal-Dual  Algorithms 
7.1.  A  Primal-Dual  Method 

Algorithms  PPD  and  DPD  both  generate  a  sequence  of  approximations  to 
In  addition,  solves  the  system  of  equations  /pjj(2,x,?/)  =  0,  where 


JpD{z,x,y)  = 


(J\ 

f 

\  / 


^  Xz  -  lie  '' 
c  -  A^y  -  z 
^  Ax  -b  y 


(7.1) 


Newton’s  method  for  these  nonlinear  equations  gives  the  linear  system 

'  X  Z  0  \ 

JpoAv  =  -fpD,  where  Jpp  =  -/  0  , 

^  0  A  0  y 


(7.2) 


which  has  been  used  by  Lustig,  Marsten  and  Shanno  [LMS89,LMS90],  Mehrotra 
[Meh90],  and  Gill  et  al.  [GMPS91]  (see  also  Lustig  [Lus88]).  Methods  based  on 
the  solution  of  (7.2)  are  usually  referred  to  as  primal-dual  algorithms  because  both 
X  and  z  are  maintained  to  be  positive.  It  must  be  stressed  that  this  terminology 
does  not  imply  any  direct  connection  between  (7.2)  and  the  primal-dual  form  of 
LP.  If  the  latter  is  transformed  using  a  barrier  function,  the  resulting  optimality 
conditions  involve  six  sets  of  variables  and  two  independent  systems  of  equations 
that  are  identical  to  (4.1)  and  (6.3). 

Unlike  Jp  and  Jp,  Jpo  is  independent  of  p.  If  a  is  chosen  to  maintain  sufRcieiit 
positivity  in  both  x  and  z,  Jp^  will  be  a  bounded  matrix  with  a  bounded  condition 
number.  A  key  difference  with  these  equations  is  that  it  is  no  longer  obvious  that 
if  a  is  chosen  to  satisfy  the  Goldstein-.Armijo  conditions  then  a  suitalde  step  tliat 
maintains  both  z  >  0  and  x  >  0  exists.  We  therefore  propose  an  algorithm  that 
takes  a  different  step  in  the  x-space  than  in  the  y  and  z-spacc  and  uses  M{x,p) 
as  a  merit  function  rather  than  ||/pj|^.  If  ctz?  ty  and  Tz  are  preassigned  positive 
constants,  let  Sy  and  be  the  sets 


■5V  =  {2/ 1  lll/ll  <  Ty)  and  5z  =  {2  I  0  <  (7ze  <  z  <  Tzc}. 
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Algorithm  PD  (Model  Primal-Dual  Algorithm) 

Set  Vo,  with  xo  >  0,  20  €  Sz  and  yo  e  Sy; 

Set  k  =  0,i  =  Q  and  4  =  0; 
while  not  converged  do 
Set  fi  =  fik] 

while  ||A^5f(x,-,/i)||  -f  ||r||  >  do 

Select  any  £  Sz  and  j/,+i  G  Sy; 

Solve  JpnAvi  =  - fpo  for  Axi\ 

Find  Xj+i  =  Xi  -f  aiAxi  such  that 
M(x,+i,p)  <  M{xi,p)  and  x,+i  >  0; 

Set  z  =  i  -f- 1; 
end  do; 

Set  k  =  k  1,  ik  =  i', 
end  do 

The  convergence  of  Algorithm  PD  follows  directly  if  it  can  be  shown  that  (7.2) 
generates  a  sequence  {x;}  converging  to  x*{p,). 

Given  positive  constants  and  Tm,  define  the  level  set 

5  =  {x  I  ||Ax  -  6|1  <  Tr,  M{x,p)  <  Tm}. 

Similar  arguments  used  to  prove  Lemma  4.1  show  that  S  is  compact. 

Lemma  7.1.  If  x  &  S  then  there  exists  a  positive  ax,  independent  of  x,  such  that 
X{  >axe.  I 

Lemma  7.2.  Given  positive  constants  Tr,  Ty,  Tx  and  Tz,  assume  that  x,  y  and  z 
satisfy  ||r||  =  ||Ax  -  b\\  <  \\y\\  <  Ty,  0  <  x  <  TxB  and  0  <  z  <  TzC..  Then  there 

exist  constants  p,  7  ('y  >  0)  and  0  (P  >1)  such  that 

/lx^VM(x,p)<-7||iV^5|p-/Ji|riii, 

where  Ax  is  defined  by  (7.S). 

Proof.  If  the  system 

(r  "')(-;)-(-'r-7*)  "» 

is  solved  for  Ax  and  Ay,  it  follows  from  the  assumptions  that  l|^x||  is  bounded. 

Observe  that  the  right-hand  side  of  (7.3)  is  identical  to  that  of  (2.2).  It  follows 
that 

HnAxs  =  -N^{g  +  ZX-'^A^Axx),  (7.4) 

where  g  =  VB{x,p),  Hu  -  N'^ZX~^N,  Ax  =  NAx^  -b  A'^Ax^,  and 

AAJ^Axa 


(7.5) 


7.  Primal-Dual  Algorithms 
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Consider  now  the  merit  function  M{x,p)  given  in  Section  3.  By  definition  we 
have 

M{x,p)  =  Ax^N^{g  +  pA^e)  +  Ax^A(g  +  pA^e), 

where  e  is  a  vector  whose  elements  are  one  in  magnitude  and  whose  signs  are  the 
same  as  r.  Define  u  =  {AA^)~^A(I  -  X~^ZNHj^^N^)g.  Substituting  for  Ax^ 
from  (7.4)  and  Ax^^  from  (7.5)  gives 

Ax^VM{x,p)  =  -g^NH^^N^g  -  r^u  -  pr^e, 

where  7  >  0  is  the  reciprocal  of  the  largest  eigenvalue  of  JIn,  /?  >  1)  and  p  is  chosen 
such  that  ^ 

p  —  max(l  -  0)> 

with  Uaj  the  vector  u  evaluated  at  the  point  i  6  5  for  which  r\  has  its  minimum 
value.  I 


Lemma  7,3.  If  Ax  is  defined  by  (7.2),  there  exist  ijositive  a  and  such  that  the 
Goldstein- Armijo  conditions  are  satisfied  with  x  +  aAx  >  cr^e. 

Proof.  If  is  the  largest  feasible  step  along  Ax,  then  M{x  +  a^iAxyp)  is  infinite 
and  it  follows  that  there  exists  a  positive  number  a*  that  solves  the  problem  of 
miuo  M {x  +  aAx,p)  subject  to  a:  +  aAx  >  0.  Hence,  a  strictly  feasible  point  exists 
for  which  the  Goldstein- Armijo  conditions  are  satisfied.  | 


Lemma  7.4.  Let  Oz,  Ty  and  Tz  be  preassigned  positive  constants.  Consider  se¬ 
quences  {zi}  and  {y,}  such  that  (TzC  <  z,  <  TzC  and  j|y,j|  <  Ty.  Let  {r,}  denote 
the  sequence  a;,+i  =  x;  -t-  a,Ax,  and  xq  >  0,  where  Ax,  is  defined  by  (7.1)  and 
a,'  satisfies  the  Goldstein- Armijo  conditions  on  M{x,p)  with  the  requirement  tlia' 
a;, -4.1  >  0.  If  p  is  sufficiently  large  (but  bounded)  then  lim,_ooa;i  =  x*{p). 

Proof.  Since  {a;,}  lies  in  a  compact  set,  it  follows  that  x,  is  bounded  for  all  i. 
Moreover,  since  x',-  lies  in  S,  there  exists  a  positive  Ox  such  that  x,  >  Oxe  for  all  i. 
Every  element  of  the  sequence  {x,}  satisfies  the  assumptions  of  Lemma  7.2  and  we 
have 

AxJVM{xi,p)  <  -ii\\N'^gixi)\\‘^  -  /9||7’(x,)|1i, 

where  7  >  0  and  /?  >  1.  It  follows  from  Lemma  7.2  that  {M{x,,p)}  is  a  strictly 
monotonically  decreasing  sequence.  Since  {x,}  6  S,  it  follows  that  {jU(x,,/?))  luiini 
converge  and  the  Goldstein-Armijo  conditions  give 

lim  aiAxlVM{xi,p)  =  lim  a,(7llA^'^'/(a;i)||^  + /3||7'(a;,)||i )  =  0. 

t— *00  I— >00 

The  proof  now  follows  a  similar  argument  to  that  given  in  Lemma  2.1.  | 
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Lemma  7.5.  If  the  assumptions  and  definitions  of  Lemma  7.4  hold  then 
lim  yi  +  Ay,  =  y*{p)  and  lim  zi  +  Az{  =  z*{p). 

t— *00  t— >00 

Proof.  It  follows  from  (7.3)  and  the  optimality  conditions  of  (2.1)  that 
lim  yi  +  Ayi  =  y*{p)  and  lim  Ax,  =  0. 

t— ^00  I— *00 

From  (7.2)  we  have  Z,-(x,-  +  Axi)  =  -XiAzi  +  pe.  Since  lim,_ooa;i  =  x*{p)  and 
lim,_oo  Axi  =  0,  we  have 

lim  Zi  +  Azi  =  pX*{p)  ^e  =  z*{p), 

»-+oo 

where  X^{p)  =  diag((a:^(/x))j).  | 

The  above  result  shows  that  even  for  quite  arbitrary  choices  of  {zi}  and  {yi}, 
approximations  to  y*{p)  and  z*{p)  may  be  obtained. 

In  practice,  certain  choices  of  Zi  and  y;  lead  to  more  efficient  algorithms.  The 
primal  algorithm  of  Section  3.2  using  the  merit  function  M{x,p)  may  be  viewed  as 
being  equivalent  to  Algorithm  PD  with  Zi  chosen  as  pXf^e  and  y,+i  =  t/j  +  aiAiji. 
Since  ||Ax  -  6||i  is  implicitly  bounded  by  the  linesearch,  Lemma  4.1  implies  that  Xi 
is  bounded.  It  follows  that  each  (z,)j  is  bounded  away  from  zero,  and  Zi  £  Sz  for 
suitably  small  Oz. 

Alternatively,  values  of  y  and  z  may  be  determined  from  a  linesearch.  A  steplength 
6,  in  the  z  and  i/-space  can  be  taken  as  an  approximate  solution  of  the  univariate 
problem 

minimize  \\fpD{zi  +  OAzi,  a:,-,  y,  +  0^y,)|p 
9 

subject  to  Zi  +  9Azi  >  r}pXf^-^e,  0  <  I, 
where  rj  Ls  some  preassigned  constant  in  (0, 1]. 

7.2.  Another  Primal-Dual  Algorithm 

A  second  primal-dual  algorithm  can  be  derived  by  observing  that  v*{p)  solves  the 
system  of  equations  fpDD{z,x,y)  —  0,  where 


(f) 

(  pX-^Z-^-e  \ 

fpDD{z,x,y)  = 

f 

= 

c  -  A^y  -  z 

{  r  ) 

Ax  -h  ) 

Newton’s  method  for  these  equations  gives  the  linear  .system  JpooAv 
where 

I  -pz-'^x-^  -px-^z-'^  C  \ 


J 


PDD  — 


V 


-/ 

0 


0 

A  0  ) 


(7.G) 


■L 


PDDi 
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Unlike  the  primal-dual  method  of  Section  7.1,  there  always  exists  a  steplength  that 
satisfies  both  the  Goldstein-Armijo  conditions  for  ||/pdd11^  and  the  restrictions  x 
aAx  >  0  and  z  -J-  aAz  >  0. 

The  proof  of  the  following  theorem  is  similar  to  that  of  Theorem  6.1. 

Theorem  7.1.  Let  {u,}  be  the  sequence  generated  by  Newton’s  method  applied  to 
the  equations  (7.6).  Then  lim,_<x)  ||ui  -  n*(/i)||  =  0.  | 


At  first  sight,  the  Jcicobian  Jpdd  looks  more  complicated  than  Jp^.  However, 
the  KKT  matrix  for  the  system  is  identical  to  that  for  the  primal-dual  method  of 
Section  7.1.  We  have 


/  zx-^  V 

[a  j  \  -Ay 


/ 

\ 


c  -  2^  -t-  —Z"^x  -  A^y 
Ax  -b 


\ 

I 


(7.7) 


Hence,  the  search  direction  for  the  PD  algorithm  of  Section  7.1  may  bo  computed 
with  little  additional  effort.  A  better  direction  can  perhaps  be  constructed  from  the 
two  search  directions.  The  precise  combination  can  be  made  dynamic  and  need  be 
specified  only  after  both  directions  are  known. 

The  right-hand  side  of  (7.7)  is  identical  to  the  right-hand  side  for  the  KKT  system 
of  the  dual  algorithm.  This  implies  that  this  algorithm  is  related  to  the  dual  barrier 
algorithm  in  the  same  way  that  the  primal-dual  algorithm  of  Section  7.1  is  related 
to  the  primal.  For  example,  a  merit  function  based  on  the  dual  barrier  function 
and  dual  constraint  violations  would  enable  the  calculation  of  different  steps  in  the 
primal  and  dual  variables.  In  this  case  it  is  the  step  in  the  primal  variables  that 
may  be  chosen  arbitrarily. 

Note  that  any  linear  combination  of  the  systems  of  equations  (7.6),  (7.1),  (4.1) 
and  (6.3)  also  leads  to  a  similar  algorithm.  In  particular,  any  linear  combination 
that  includes  the  primal-dual  equations  (7.6)  (no  matter  how  tiny  a  proportion)  has 
the  property  that  a  suitable  steplength  exists  for  which  a:  and  z  are  positive. 


8.  The  Treatment  of  Free  Variables 

If  there  are  no  bounds  on  a  particular  variable,  no  difficulty  arises  provided  Av 
is  computed  directly  from  the  KKT  system.  However,  a  common  approach  for 
(oinputiiig  Av  is  first  to  compute  Ay  using  the  Schur  complement  of  the  leading 
diagonal  matrix.  If  the  leading  diagonal  -o  singular,  such  an  approach  cannot  be 
used.  (We  may  always  use  the  Schur  complement  of  the  nonjingular  portion  of  the 
diagonal  matrix  but  this  is  no  longer  a  definite  system.) 

We  shall  consider  just  the  primal  algorithm,  but  the  approach  suggested  licre 
may  be  used  in  all  the  methods.  For  simplicity,  assume  that  Xr  is  the  only  free 
variable.  In  place  of  {fp)r  =  Zr  -  p/xr  we  now  have  {fp)r  =  Zr,  with  z*{p)  =  0.  As 
long  as  the  explicit  KKT  system  is  solved,  the  effect  of  this  equation  on  the  Jacobian 
is  inconsequential.  However,  in  place  of  we  now  have  D,  where  dr  =  0  and 
dj  =  pxj'  for  j  ^  r.  Hence  D  is  singular  and  its  Schur  complement  does  not  exist. 
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A  means  of  circumventing  this  difficulty  is  to  replace  the  equation  =  0  by  an 
equation  that  ensures  0  as  n  0.  For  example,  we  could  use 

e®'"Zr  =  P  or  Zr  +  pXr  =  0, 

which  give  dr  =  Zr  and  dr  =  p  respectively.  In  the  first  example,  z*(p)  = 
and  we  may  keep  Zr  >  cr^  >  0.  It  follows  that  D  is  nonsingular  and  its  Schur 
complement  exists.  In  the  second  example,  since  Zr  does  not  appear  in  the  Jacobian, 
its  sign  or  magnitude  is  not  important.  Likewise,  the  nonsingularity  of  Jp  no  longer 
depends  on  Xr,  so  there  is  no  need  to  restrict  the  steplength  for  this  variable. 

9.  Further  Comments 

A  common  practice  in  interior- point  implementations  is  to  define  the  steplength  as 
some  fixed  percentage  of  the  maximum  feasible  step.  By  contrast,  all  the  algorithms 
described  in  this  paper  require  some  form  of  linesearch  for  the  steplength.  In  practice 
this  requirement  has  a  minimal  effect  upon  computation  time,  given  the  work  needed 
to  compute  the  search  direction.  Moreover,  if  %  is  close  to  one  and  rfi  is  close  to 
zero,  almost  any  step  is  likely  to  satisfy  the  Goldstein-Armijo  conditions  because  all 
the  linesearch  functions  are  convex  and  increase  rapidly  near  the  boundary  of  the 
feasible  region.  In  practice  we  have  observed  that  the  need  to  perform  a  linesearch 
arises  >  nly  when  there  is  significant  numerical  error  in  the  search  direction. 

Currently  the  most  efficient  implementations  use  a  predictor-corrector  method 
to  define  the  search  direction  (e.g.  [LMS90,Meh90]).  Such  a  strategy  may  be  incor¬ 
porated  in  the  algorithms  discussed  here.  The  important  point  is  to  be  able  to  fall 
back  on  a  guaranteed  method  should  the  predictor-corrector  direction  fail  to  be  a 
suitable  descent  direction  for  the  merit  function.  A  similar  view'  was  adopted  by 
Mehrotra  [Meh90]. 

It  has  not  been  our  intent  to  compare  the  various  algorithms  in  terms  of  perfor¬ 
mance.  All  the  primal-dual  algorithms  have  very  similar  theoretical  properties,  but 
only  the  primcd-dual  algorithm  of  Section  7.1  has  been  used  in  the  principal  known 
implementations  [LMS89,LMS90,Meh90,GMPS91].  The  key  system  of  equations  is 
“less  nonlinear”  than  for  the  other  three  variations.  Even  so,  in  the  neighborhood  of 
the  solution,  the  Jacobian  behaves  almost  identically  to  the  Jacobians  of  the  other 
systems  (as  does  the  KKT  matrix).  It  is  not  immediately  apparent  that  this  method 
is  inherently  superior  to  the  others. 

It  may  be  that  the  best  method  is  dependent  on  how  the  linear  systems  are 
solved.  For  example,  all  the  methods  may  be  implemented  by  solving  systems  of 
the  form  ADA^Ay  =  u,  where  D  is  either  Z~^  or  XZ"^.  Suppose  these 
systems  are  solved  using  a  conjugate  gradient  method  in  which  a  preconditioner  is 
based  on  periodically  forming  the  Cholesky  factors  of  ADAX .  The  systems  using 
D  =  should  yield  better  preconditioners  as  the  iterates  converge  because  the 
ratio  of  consecutive  values  of  any  significant  diagonal  of  D  converges  to  one.  When 
D  =  XZ~^  01  D  =  Z~^,  the  significant  diagonals  correspond  to  the  small  elements 
of  z.  It  is  not  obvious  that  the  ratio  of  consecutive  values  of  any  such  diagonal  will 
behave  as  smoothly. 
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Our  analysis  is  directed  at  the  linear  programming  problem.  However,  extending 

the  results  to  a  smooth  convex  objective  function  is  quite  straightforward.  The  more 

challenging  problem  is  to  extend  the  results  to  nonconvex  problems. 
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